iDISQUE: Tuning High-Dimensional Similarity Queries in DHT Networks

Authors

  • Xiaolong Zhang
  • Lidan Shou
  • Kian-Lee Tan
  • Gang Chen
Abstract

In this paper, we propose a fully decentralized framework called iDISQUE to support tunable approximate similarity queries over high-dimensional data in DHT networks. The iDISQUE framework uses a distributed indexing scheme to organize data summary structures called iDisques, which describe the cluster information of the data on each peer. iDisques are published using a locality-preserving mapping scheme, and approximate similarity queries are resolved using the distributed index. The accuracy of query results can be tuned via both the publishing cost and the query cost. We employ a multi-probe technique to reduce the index size without compromising the effectiveness of queries. We also propose an effective load-balancing technique based on multi-probing. Experiments on real and synthetic datasets confirm the effectiveness and efficiency of iDISQUE.
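The abstract does not say which locality-preserving mapping or probing strategy iDISQUE uses. As a rough illustration of the general idea only, the sketch below swaps in random-hyperplane hashing to map cluster centroids to keys, and single-bit-flip probing at query time. All names (`ToyIndex`, `lsh_key`, `probe_keys`) are hypothetical, and a plain dictionary stands in for the DHT key space.

```python
import random


def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random hyperplanes in `dim` dimensions (assumed mapping)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def lsh_key(vector, hyperplanes):
    """Map a vector to an integer key: one bit per hyperplane side.

    Vectors separated by a small angle tend to share most bits, which is
    the locality-preserving property this sketch relies on.
    """
    key = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        key = (key << 1) | (1 if dot >= 0 else 0)
    return key


def probe_keys(key, bits):
    """Return the base key plus every key at Hamming distance 1 from it."""
    return [key] + [key ^ (1 << i) for i in range(bits)]


class ToyIndex:
    """In-memory stand-in for a DHT: a dict from key to published summaries."""

    def __init__(self, dim, bits=8, seed=0):
        self.bits = bits
        self.planes = make_hyperplanes(dim, bits, seed)
        self.buckets = {}

    def publish(self, peer_id, centroid):
        """Publish one cluster summary (here just a centroid) under one key."""
        key = lsh_key(centroid, self.planes)
        self.buckets.setdefault(key, []).append((peer_id, centroid))

    def query(self, point, multi_probe=True):
        """Return candidate peers, optionally probing neighbouring keys."""
        base = lsh_key(point, self.planes)
        keys = probe_keys(base, self.bits) if multi_probe else [base]
        peers = []
        for k in keys:
            for peer_id, _ in self.buckets.get(k, []):
                if peer_id not in peers:
                    peers.append(peer_id)
        return peers


if __name__ == "__main__":
    index = ToyIndex(dim=3, bits=8)
    index.publish("peer-A", [1.0, 2.0, 3.0])
    print(index.query([1.0, 2.0, 3.0]))
```

In this toy setting each summary is stored under a single key and the query pays for the extra lookups, which is one way probing can trade query cost for a smaller index. Whether iDISQUE makes the same trade in the same way is not stated in the abstract.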


Similar resources

CISS: An Efficient Object Clustering Framework for DHT-Based Peer-to-Peer Applications

In most DHT-based peer-to-peer systems, objects are totally declustered since such systems use a hash function to distribute objects evenly. However, such object declustering can result in significant inefficiencies in advanced access operations such as multi-dimensional range queries, continuous updates, etc., which are common in many emerging peer-to-peer applications. In this paper, we pr...

Implementing Dynamic Querying Search in k-ary DHT-based Overlays

Distributed Hash Tables (DHTs) provide scalable mechanisms for implementing resource discovery services in structured Peer-to-Peer (P2P) networks. However, DHT-based lookups do not support some types of queries which are fundamental in several classes of applications. A way to support arbitrary queries in structured P2P networks is implementing unstructured search techniques on top of DHT-based...

A Short Survey on P2P Data Indexing

P2P data indexing has recently attracted considerable research effort. For the various proposed schemes, there are generally two taxonomies: 1) From a systematic point of view, existing schemes fall into two categories: the over-DHT indexing paradigm, which, in a layered manner, indexes data in DHT key space (i.e., over DHT), and the overlay-dependent indexing paradigm, which indexes data directly ...

DHTJoin: Processing Continuous Join Queries using DHT Networks

This paper addresses the problem of computing approximate answers to continuous join queries. We present a new method, called DHTJoin, which combines hash-based placement of tuples in a Distributed Hash Table (DHT) and dissemination of queries exploiting the trees formed by the underlying DHT links. DHTJoin distributes the query workload across multiple DHT nodes and provides a mechanism that a...

Enabling Dynamic Querying over Distributed Hash Tables

Dynamic querying (DQ) is a search technique used in unstructured peer-to-peer (P2P) networks to minimize the number of nodes that must be visited to reach the desired number of results. In this paper we introduce the use of the DQ technique in structured P2P networks. In particular, we present a P2P search algorithm, named DQ-DHT (Dynamic Querying over a Distributed Hash Table), to perform...

Publication date: 2010